A Hybrid Word-Character Model for Abstractive Summarization
نویسندگان
چکیده
Abstractive summarization is the popular research topic nowadays. Due to the difference in language property, Chinese summarization also gains lots of attention. Most of studies use character-based representation instead of word-based to keep out the error introduced by word segmentation and OOV problem. However, we believe that word-based representation can capture the semantics of the articles more accurately. We proposed a hybrid word-character model preserves the advantage of both word-based and characterbased representations. Our method also enables us to use larger word vocabulary size than anyone else. We call this new method HWC (Hybrid Word-Character). We conduct the experiments on LCSTS Chinese summarization dataset, and outperform the current state-of-the-art by at least 8 ROUGE points.ive summarization is the popular research topic nowadays. Due to the difference in language property, Chinese summarization also gains lots of attention. Most of studies use character-based representation instead of word-based to keep out the error introduced by word segmentation and OOV problem. However, we believe that word-based representation can capture the semantics of the articles more accurately. We proposed a hybrid word-character model preserves the advantage of both word-based and characterbased representations. Our method also enables us to use larger word vocabulary size than anyone else. We call this new method HWC (Hybrid Word-Character). We conduct the experiments on LCSTS Chinese summarization dataset, and outperform the current state-of-the-art by at least 8 ROUGE points.
منابع مشابه
A Hybrid Approach to Multi-document Summarization of Opinions in Reviews
We present a hybrid method to generate summaries of product and services reviews by combining natural language generation and salient sentence selection techniques. Our system, STARLET-H, receives as input textual reviews with associated rated topics, and produces as output a natural language document summarizing the opinions expressed in the reviews. STARLET-H operates as a hybrid abstractive/...
متن کاملTL;DR: Improving Abstractive Summarization Using LSTMs
Traditionally, summarization has been approached through extractive methods. However, they have produced limited results. More recently, neural sequence-tosequence models for abstractive text summarization have shown more promise, although the task still proves to be challenging. In this paper, we explore current state-of-the-art architectures and reimplement them from scratch. We begin with a ...
متن کاملMulti-Document Abstractive Summarization Using ILP Based Multi-Sentence Compression
Abstractive summarization is an ideal form of summarization since it can synthesize information from multiple documents to create concise informative summaries. In this work, we aim at developing an abstractive summarizer. First, our proposed approach identifies the most important document in the multi-document set. The sentences in the most important document are aligned to sentences in other ...
متن کاملA Neural Attention Model for Abstractive Sentence Summarization
Summarization based on text extraction is inherently limited, but generation-style abstractive methods have proven challenging to build. In this work, we propose a fully data-driven approach to abstractive sentence summarization. Our method utilizes a local attention-based model that generates each word of the summary conditioned on the input sentence. While the model is structurally simple, it...
متن کاملAutomatic Community Creation for Abstractive Spoken Conversations Summarization
Summarization of spoken conversations is a challenging task, since it requires deep understanding of dialogs. Abstractive summarization techniques rely on linking the summary sentences to sets of original conversation sentences, i.e. communities. Unfortunately, such linking information is rarely available or requires trained annotators. We propose and experiment automatic community creation usi...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1802.09968 شماره
صفحات -
تاریخ انتشار 2018